OS-level hang detection in complex software systems

Authors

  • Antonio Bovenzi
  • Marcello Cinque
  • Domenico Cotroneo
  • Roberto Natella
  • Gabriella Carrozza
Abstract

Many critical services are nowadays provided by large and complex software systems. However, the increasing complexity introduces several sources of non-determinism, which may lead to hang failures: the system appears to be running, but some of its services are perceived as unresponsive. On-line monitoring is the only way to detect and promptly react to such failures. However, when dealing with Off-The-Shelf based systems, on-line detection can be tricky, since instrumentation and log data collection may not be feasible in practice. In this paper, a detection framework to cope with software hangs is proposed. The framework enables the non-intrusive monitoring of complex systems, based on multiple sources of data gathered at the Operating System (OS) level. Collected data are then combined to reveal hang failures. The framework is evaluated through a fault injection campaign on two complex systems from the Air Traffic Management (ATM) domain. Results show that combining several monitors at the OS level is effective at detecting hang failures in terms of coverage and false positives, with a negligible impact on performance.
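
The abstract does not say how the collected data are combined, so the following is only a minimal sketch of one plausible approach: weighted voting over per-monitor alarms. The monitor names, thresholds, weights, and the `hang_suspected` function are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MonitorReading:
    """One observation from a hypothetical OS-level monitor."""
    name: str         # illustrative monitor name, not from the paper
    value: float      # observed value, e.g. seconds without activity
    threshold: float  # value above which this monitor raises an alarm
    weight: float     # assumed trust placed in this monitor

    def alarm(self) -> bool:
        """Return True when the observed value exceeds the threshold."""
        return self.value > self.threshold


def hang_suspected(readings: Iterable[MonitorReading],
                   decision_threshold: float = 0.5) -> bool:
    """Combine per-monitor alarms into one verdict by weighted voting.

    Assumption: a hang is suspected when the weight of the alarming
    monitors reaches `decision_threshold` of the total weight.
    """
    readings = list(readings)
    total = sum(r.weight for r in readings)
    if total == 0:
        return False  # no monitors (or no trusted monitors): no verdict
    score = sum(r.weight for r in readings if r.alarm()) / total
    return score >= decision_threshold


if __name__ == "__main__":
    sample = [
        MonitorReading("syscall_idle_s", 12.0, 5.0, 0.5),
        MonitorReading("lock_wait_s", 8.0, 3.0, 0.3),
        MonitorReading("io_idle_s", 1.0, 10.0, 0.2),
    ]
    print("hang suspected:", hang_suspected(sample))
```

Combining several weak indicators this way is one common way to trade coverage against false positives: a single noisy monitor cannot raise a hang alarm on its own unless its weight alone reaches the decision threshold.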

Similar resources

An Operating-System-Level Framework for Providing Application-Aware Reliability

Operating systems enable collecting and extracting rich information on application execution characteristics, including program counter traces, memory access patterns, and operating-system-generated signals. This information can be exploited to design highly efficient, application-aware reliability mechanisms that are transparent to applications. This paper describes the Reliability MicroKernel...

Fault Detection and Isolation of Multi-Agent Systems via Complex Laplacian

This paper studies the problem of fault detection and isolation (FDI) for multi-agent systems (MAS) via complex Laplacian subject to actuator faults. A planar formation of point agents using simple, linear interaction rules related to the complex Laplacian is achieved. The communication network is a directed yet connected graph with a fixed topology. The loss of symmetry in the...

Radiosynthesis of 191Os-2-acetylpyridine thiosemicarbazone complex, as an in vivo therapeutic radionuclide generator

Introduction: Due to the anti-proliferative properties of platinum group-thiosemicarbazone complexes, the production of 191Os-labeled 2-acetyl pyridine 4-N-methylthiosemicarbazone (191Os-APMTS) was investigated. Methods: 191Osmium (T½ = 15.4 d) was produced via the 190Os(n,γ)191Os nuclear reaction using enriched target irradiated...

Quantum Mechanical Calculations of Photovoltaic and Photoelectronic Properties of Oligoselenophene/Fullerene BHJ Solar Cells

To model the active layer in the hetero-junction solar cells, the C60, C70, PC60BM, PCBDAN fullerenes as acceptor, and (OS)n=1) oligoselenophenes as donor were considered. The (OS)n=14/C60, (OS)n=14/C70, (OS)n=14/PC60BM, and (OS)n=14/PCBDAN blends as a model of the active layer in the BHJ solar cell were chosen, and the optoelectronic properties were studied. The calculated efficiency of these ...

ObjectAgent for Robust Autonomous Control

The ObjectAgent system is being developed to create a robust software architecture for autonomous control of complex systems. Agents are used to implement all of the software functionality and communicate through simplified natural language messages. These agents have a set of basic survival skills that monitor for internal software faults, providing low-level fault detection and recovery. High...

Journal title:
  • IJCCBS

Volume 2, Issue

Pages -

Publication date 2011